NETWORK PROCESSOR SYSTEM

Brian A. Petersen 
Mark A. Ross 



BACKGROUND OF THE INVENTION 

Field of the Invention 

This invention relates to the field of network processors, specifically network 
processors adapted to perform packet processing. 

Description of the Related Art

In the data networking field there exists a long-felt need to provide faster
packet processing using fewer system resources and more efficient hardware. Those of
ordinary skill in the art have long realized that a programmable processing system can
be readily adapted to provide packet processing. However, such systems are typically
implemented in custom or semi-custom application specific integrated circuits
(ASICs), which are difficult and costly to develop and produce. Furthermore, such
ASICs are not readily changeable in the event that packet configurations, processing 
requirements, or standards change over time. 

What is needed is a rapidly adaptable packet processing system able to be
easily configured to perform a wide range of packet processing tasks without redesign
or reconstruction of the processor system hardware itself.

SUMMARY 

Presently disclosed is a general-purpose, software-controlled central processor
augmented by a set of task-specific, specialized peripheral processors (simply referred
to as "peripherals"). The central processor accomplishes its software-determined
functions with the support of the peripheral processors. Peripheral processors may
include but are not limited to a packet parser, which provides the central processor
with a numerical summary of the packet format; a packet deconstructor, which
extracts designated fields from the packet, the positions of which are determined by
the central processor according to the packet format; a search engine, which is
supplied a lookup index by the central processor and returns its results to it; and a
packet editor, which modifies the packet as determined by the central processor using
the previously identified information from other peripherals.

At each step in the use of this network processor system, the central processor
has an opportunity to intervene and modify the handling of the packet based on its
current interpretation of peripheral processor results. The programmable nature of the
central processor and the peripheral processors provides the system with flexibility
and adaptability. Rather than having to modify a circuit or system design in an ASIC
or other complex hardware device, new packet processing applications may be
accommodated through the development of new software and its deployment in the
central and/or peripheral processors.

BRIEF DESCRIPTION OF THE DRAWINGS 

The present disclosure may be better understood and its numerous features and
advantages made apparent to those skilled in the art by referencing the accompanying
drawings.

Figure 1 is a high-level block diagram of the central processor/peripheral
processor architecture according to one embodiment of the present invention.

Figure 2 is a flowchart of the sequence of events by which a packet is processed 
according to one embodiment of the present invention. 

The use of the same reference symbols in different drawings indicates similar or
identical items.

DETAILED DESCRIPTION

Architecture

The network packet processor system of one embodiment of the present
invention comprises a central processor (CP) and a set of peripheral processors (PPs).
In some embodiments of this architecture, the peripheral processors each
communicate only with the central processor; they do not communicate with each
other. In other embodiments, the PPs can share with other PPs information either
passed from the CP or derived within one or more PPs. The CP acts as the coordinating
and controlling processor while each peripheral processor performs specialized tasks
with high efficiency. The advantage of this architecture is that the individual
processor (CP and PP) workflows, tasks, and functions are completely modularized
and configurable by appropriate processor programming.

Figure 1 shows a high-level block diagram of the present central/peripheral
processor system architecture 100 for packet processing. Central processor 110
receives packets through any of a number of means well known in the art. Central
processor 110 performs, in some embodiments, preliminary format checking, e.g.,
checksum validation, and passes the packet or parts of the packet to one or more
peripheral processors for additional work. Central processor 110 may pass data to one
or more peripheral processors 120, 130, 140, and 150 in sequence, in parallel, or in a
pipelined fashion.

Central processor 110 is a general-purpose programmable processor, such as
(but not limited to) an embedded processor core available from Tensilica, Inc. of
Santa Clara, California or Advanced RISC Machines (ARM) Ltd. of Cambridge,
England. In some embodiments of the present invention, the embedded core forming
central processor 110 is part of an application specific integrated circuit (ASIC).

In one embodiment of the present invention, shown in Fig. 1, four peripheral
processors 120, 130, 140, and 150 are employed. One of ordinary skill in the art will
readily see that fewer or more PPs may be employed without deviating from the spirit
of the present invention. Accordingly, the present architecture is not limited to a
certain number of peripheral processors.

Peripheral processors 120, 130, 140, and 150 may each be implemented
independently in any form of processing module or ASIC known in the electronic arts.
For instance, any PP may be a collection of discrete, fixed (hard-wired) logic, a
programmable or fixed state machine, a microsequencer or microprocessor, a stored
program-controlled processor using either ROM or RAM storage or a combination
thereof, or a general-purpose, fully programmable computer. Any implementation
form may be selected according to the tasks and functions of each PP and network
packet processor system 100 overall. Accordingly, the present invention is not limited
in the physical implementation of any PP.

In some embodiments of the present invention, central processor 110 and one
or more PPs are contained in the same ASIC.

Sequence Of Events 

In the embodiment of Fig. 1, the four PPs are packet parser 120, packet
deconstructor 130, search engine 140, and packet editor 150. Each performs specific
functions at the request of central processor 110 and returns results to central
processor 110.

As packets are received, they are simultaneously delivered to packet parser 120. A
buffer (not shown) may also be employed to provide latency compensation, as is well
known in the art. Packet error detection code(s), such as the well-known cyclic
redundancy check (CRC) field, are verified if present. Reception errors are flagged
and included as part of a status word that is associated with the packet by packet
parser 120.
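
As a rough illustration of the error check described above, the following Python
sketch verifies a trailing CRC-32 and records the outcome in a status word. The
4-byte trailer layout, the little-endian byte order, the flag name
STATUS_CRC_ERROR, and both function names are assumptions made for illustration;
the disclosure does not specify them.

```python
import zlib

STATUS_CRC_ERROR = 0x01  # hypothetical flag bit; not defined in the disclosure


def append_crc(payload: bytes) -> bytes:
    """Build a test frame: payload followed by its CRC-32 (little-endian)."""
    return payload + zlib.crc32(payload).to_bytes(4, "little")


def receive_status(frame: bytes) -> int:
    """Return a status word with STATUS_CRC_ERROR set if the check fails."""
    if len(frame) < 4:
        # Too short to carry a CRC trailer at all: flag it as an error.
        return STATUS_CRC_ERROR
    payload, trailer = frame[:-4], frame[-4:]
    ok = zlib.crc32(payload) == int.from_bytes(trailer, "little")
    return 0 if ok else STATUS_CRC_ERROR
```

A caller would attach the returned status word to the packet so that the central
processor can decide how to handle a flagged reception error.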

The packet is deposited into a latency buffer primarily to allow a minimum
amount of data to accumulate for address lookup purposes. The latency buffer makes
the received packet data available to packet deconstructor 130 and central processor
110 prior to the packet being stored in a central packet data buffer (not shown).

Packet parser 120 takes a quick look at the received packet and assigns a
"vector" to the packet that indicates to central processor 110 in which of several
categories (based on, e.g., packet formats) the packet belongs. A vector, as used here,
is an identifying number or data field, such as simple byte code "0xF8" (F8 in
hexadecimal notation). The vector can be one or more bits, bytes, or words. This
provides central processor 110 a head start in the processing of the received packet.
Knowing the packet vector, central processor 110 knows where in the packet the
fields of interest are located without having to examine the packet itself. This
knowledge is stored in central processor 110, in one embodiment, using templates that
indicate the desired fields for each vector, i.e., for each type of packet. Operationally,
if the packet conforms to one of several expected formats as indicated by the vector,
the appropriate processing template held within packet deconstructor 130 is selected
by central processor 110. Packet deconstructor 130 executes the selected template by
reading the required data directly from the latency buffer using pointers maintained by
the latency buffer.
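
The relationship between vectors and templates can be sketched as follows. This
is a minimal Python illustration, assuming untagged Ethernet II frames classified
by EtherType; the vector values other than the 0xF8 example, the template layout
(field name, byte offset, length), and all identifiers are hypothetical and do
not come from the disclosure.

```python
# Hypothetical vector codes; the disclosure gives only 0xF8 as an example value.
VEC_IPV4, VEC_ARP, VEC_UNKNOWN = 0xF8, 0xF9, 0x00


def parse_vector(frame: bytes) -> int:
    """Classify a frame by its EtherType (bytes 12-13) and return a vector."""
    if len(frame) < 14:
        return VEC_UNKNOWN
    ethertype = int.from_bytes(frame[12:14], "big")
    return {0x0800: VEC_IPV4, 0x0806: VEC_ARP}.get(ethertype, VEC_UNKNOWN)


# One template per vector: (field name, byte offset, length in bytes).
TEMPLATES = {
    VEC_IPV4: [("dst_mac", 0, 6), ("src_mac", 6, 6), ("dst_ip", 30, 4)],
    VEC_ARP: [("dst_mac", 0, 6), ("src_mac", 6, 6), ("target_ip", 38, 4)],
}


def select_template(vector: int):
    """Return the field template for a vector, or None if none is defined."""
    return TEMPLATES.get(vector)
```

Because the template is chosen from the vector alone, the selecting code never
has to inspect the packet contents, which mirrors the head start described above.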

Packet deconstructor PP 130 delivers one set of selected fields to central
processor 110 and accumulates a (possibly different) set of fields into a search
argument that it delivers (in some embodiments) directly to search engine PP 140. In
other embodiments, the accumulated search argument is delivered to search engine
140 via central processor 110.
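
A minimal Python sketch of this step follows, assuming a template is a list of
(field name, byte offset, length) entries and that the search argument is a
simple concatenation of chosen fields. Both assumptions, and the function and
parameter names, are illustrative only.

```python
def deconstruct(buffer: bytes, template, key_fields):
    """Extract template fields and build a search argument.

    Returns (fields, search_argument): the first would go to the central
    processor, the second to the search engine.
    """
    fields = {}
    for name, offset, length in template:
        if offset + length > len(buffer):
            # Not enough data buffered yet for this field.
            raise ValueError(f"field {name!r} extends past buffered data")
        fields[name] = buffer[offset:offset + length]
    # The search argument may use a different subset of fields than the
    # one reported to the central processor.
    search_argument = b"".join(fields[name] for name in key_fields)
    return fields, search_argument
```

For example, a template that extracts both a destination MAC address and a
destination IP address could feed only the IP address into the search argument
while still reporting both fields to the central processor.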

In either event, the search argument is used to extract routing information,
such as the destination port, MAC address, or IP address (as appropriate to the routing
level of interest) from the routing data structures, which in some embodiments consist
of tables. Various types of packet routing lookups can be performed by search engine
140, such as the well-known OSI Layer 2, Layer 3, and/or Layer 4 lookups. The
search yields search results that are returned to central processor 110. Typically, only
one of the lookups results in a destination determination; the Layer 2 destination
address lookup, in particular, determines which lookup identifies the packet's next
destination. Central processor 110 has the option of examining the search results and
modifying the destination returned by the lookups as necessary, in case of error or
exception.
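
One way to picture the lookups and the central processor's override is the
Python sketch below. It assumes dictionary-based tables, a longest-prefix match
for the Layer 3 lookup, and an exception port to which the central processor
redirects packets on a lookup miss. The table contents, the "port"/"route"
action names, and EXCEPTION_PORT are all hypothetical.

```python
import ipaddress

# Hypothetical Layer 2 table: destination MAC -> (action, port).
L2_TABLE = {
    bytes.fromhex("02000000000a"): ("port", 3),      # bridged destination
    bytes.fromhex("020000000001"): ("route", None),  # router's own MAC
}
# Hypothetical Layer 3 table: prefix -> egress port.
L3_TABLE = {
    ipaddress.ip_network("10.0.0.0/8"): 5,
    ipaddress.ip_network("10.1.0.0/16"): 7,
}
EXCEPTION_PORT = 0  # assumed destination for packets needing special handling


def search(dst_mac: bytes, dst_ip: bytes):
    """Search engine: the Layer 2 result decides which lookup is used."""
    action, port = L2_TABLE.get(dst_mac, ("miss", None))
    if action == "port":
        return port
    if action == "route":
        addr = ipaddress.ip_address(dst_ip)
        matches = [net for net in L3_TABLE if addr in net]
        if matches:
            best = max(matches, key=lambda net: net.prefixlen)
            return L3_TABLE[best]
    return None


def resolve_destination(dst_mac: bytes, dst_ip: bytes) -> int:
    """Central processor: examine the result and override it on a miss."""
    port = search(dst_mac, dst_ip)
    return EXCEPTION_PORT if port is None else port
```

Keeping the override in a separate function reflects the division of labor in
the text: the search engine only reports results, and the decision to change a
destination stays with the central processor.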

Editor PP 150 uses the information derived from parser 120, packet
deconstructor 130, search engine 140, and central processor 110 to modify the packet
(especially its header) in order to guide the packet to its next destination. This is the
last step of the well-known routing/switching function performed by most packet
processing systems.
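
The disclosure does not say which header fields the editor changes. As one
plausible example, the Python sketch below performs the edits commonly applied
when routing an IPv4 packet: rewriting the Ethernet addresses, decrementing the
TTL, and recomputing the IPv4 header checksum. It assumes an untagged Ethernet II
frame carrying IPv4; the function names and the choice of edits are assumptions,
not the disclosed method.

```python
def ipv4_checksum(header: bytes) -> int:
    """Ones' complement checksum over an IPv4 header (even length)."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def edit_packet(frame: bytes, next_hop_mac: bytes, egress_mac: bytes) -> bytes:
    """Return a copy of the frame rewritten for its next hop."""
    out = bytearray(frame)
    out[0:6] = next_hop_mac       # new destination MAC
    out[6:12] = egress_mac        # new source MAC
    ihl = (out[14] & 0x0F) * 4    # IPv4 header length in bytes
    if out[22] <= 1:
        # TTL would expire: left to the central processor as an exception.
        raise ValueError("TTL expired")
    out[22] -= 1                  # TTL lives at IP header offset 8
    out[24:26] = b"\x00\x00"      # zero the checksum field before summing
    out[24:26] = ipv4_checksum(bytes(out[14:14 + ihl])).to_bytes(2, "big")
    return bytes(out)
```

A header whose checksum field is correct sums to zero under ipv4_checksum, which
gives a simple way to check the edited packet.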

Before the packet is forwarded by the switch/router, it is stored (buffered) in a
packet data buffer (not shown). Such storage, including all necessary queuing, flow
management, buffer management, retrieval and outbound (egress) forwarding and the
like, may be accomplished by any of a number of means well known in the packet
processing and networking arts. Accordingly, packet storage (and subsequent
retrieval) will not be further discussed herein.

Figure 2 is a flowchart of the sequence of events discussed above. Packet
processing 200 begins with packet reception 210 and buffering 220 to accommodate
latency. Packet parsing 230 is next accomplished to determine a packet vector by
which the packet is internally identified.

Processing coordination and control 240 evaluates the packet vector and
passes the packet (either directly or by reference) to packet deconstructing step 250.
Packet deconstructing 250 deconstructs the packet into its constituent parts, e.g.,
header fields, quality of service (QoS) bits, packet data payload, etc. The results of
deconstructing 250 are passed back to processing step 240 and, in some embodiments,
directly to searching (lookup) step 260.

Lookup results from search step 260 are returned to processing step 240 where
they are used to control packet editing step 270. The revised packet is then sent for
storage and forwarding 299 by means well known in the art.

At any time in process 200, processing step 240 may, upon evaluation of the
results of any PP step 230, 250, 260, or 270, redirect or alter the processing scheme
according to its own (i.e., the central processor's) programming. Such redirection may
occur, for instance, on an error or exception condition, such as the failure of a packet
to pass a CRC check or the receipt of an illegal format.
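
The flow of Fig. 2 can be sketched in Python as a coordinating function that
calls each peripheral step and checks its result before continuing. The
Peripherals container, the "drop", "exception", and "forward" action names, and
the toy peripherals at the end are assumptions made for illustration; a real
system would substitute its own peripheral implementations and exception policy.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Peripherals:
    """Callables standing in for the four peripheral processors of Fig. 1."""
    parse: Callable[[bytes], Optional[int]]      # step 230: packet -> vector
    deconstruct: Callable[[bytes, int], bytes]   # step 250: -> search argument
    search: Callable[[bytes], Optional[int]]     # step 260: -> egress port
    edit: Callable[[bytes, int], bytes]          # step 270: -> revised packet


def process_packet(packet: bytes, pp: Peripherals,
                   crc_ok: bool = True) -> Tuple[str, bytes]:
    """Coordinate one packet (step 240); return (action, packet)."""
    if not crc_ok:                   # reception error flagged upstream
        return ("drop", packet)
    vector = pp.parse(packet)
    if vector is None:               # illegal or unrecognized format
        return ("exception", packet)
    key = pp.deconstruct(packet, vector)
    port = pp.search(key)
    if port is None:                 # lookup miss: alter the handling
        return ("exception", packet)
    return ("forward", pp.edit(packet, port))


# Toy peripherals, purely for demonstration.
demo = Peripherals(
    parse=lambda p: 0xF8 if len(p) >= 2 else None,
    deconstruct=lambda p, v: p[:1],
    search=lambda k: {b"A": 4}.get(k),
    edit=lambda p, port: bytes([port]) + p[1:],
)
```

Because every peripheral result passes through process_packet, changing the
exception policy means changing only this function, which is the kind of
software-only adaptation the disclosure emphasizes.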

Alternate Embodiments 

While central processor 110 is described as a single, monolithic processor,
nothing in the architecture of the present invention so limits its implementation. In
particular, central processor 110 can be formed as an interconnected network or mesh
of two or more processors acting in concert. These processors forming central
processor 110 may be implemented in the same ASIC or other integrated circuit
device or on multiple ASICs or other integrated circuit devices. Such multi-processor
implementations of a single processing function (such as that of central processor 110)
are well known to those of ordinary skill in the art.

Furthermore, while central processor 110 may be implemented as one or more
interconnected processors, the peripheral processors, as a group, may also be
implemented in one or more "sets" of PPs in order to pipeline or parallelize packet
processing across multiple peripheral sets under the control of a single central
processor entity. As with central processor 110, the above-described PPs may be
implemented on one or more ASICs or other integrated circuit devices.

In a further alternative embodiment, central processor 110 and the peripheral
processors (in one embodiment, PPs 120, 130, 140, and 150) share a common set of
registers in order to speed up data transfer between them and calculations using the
same data. Some or all of the registers used by central processor 110 and all or some
of the peripheral processors may be logically mapped to the same memory locations
or otherwise shared by means long known in the computer and
microcomputer/microprocessor arts.
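
As a software analogy for such register sharing, the Python sketch below maps a
peripheral's register window onto part of a larger register file, so that a
value written through one view is immediately visible through the other without
copying. The RegisterFile class, the register counts, and the window positions
are hypothetical; hardware register mapping would of course be done differently.

```python
class RegisterFile:
    """A block of byte-wide registers that several processors can map."""

    def __init__(self, count: int) -> None:
        self._cells = bytearray(count)

    def window(self, base: int, count: int) -> memoryview:
        """Return a view of registers base..base+count-1; writes are shared."""
        if base < 0 or count < 0 or base + count > len(self._cells):
            raise ValueError("window outside register file")
        return memoryview(self._cells)[base:base + count]


regs = RegisterFile(16)
cp_regs = regs.window(0, 16)     # central processor sees every register
parser_regs = regs.window(8, 4)  # parser's registers alias CP registers 8..11

parser_regs[0] = 0xF8            # parser posts a vector into its register 0
```

After the last line, the central processor's view holds the same value at index
8, since both views refer to the same underlying storage.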

The order in which the processing steps of any embodiment of the present 
invention are performed is purely illustrative in nature. In fact, these steps can be 
performed in any order or in parallel, unless otherwise indicated by the present 
disclosure.

The method of the present invention may be performed in hardware,
software, or any combination thereof, as those terms are currently known in the art. In
particular, the present method may be carried out by software, firmware, or microcode
operating on a computer or computers of any type. Additionally, software embodying
the present invention may be in the form of computer instructions in any form (e.g.,
source code, object code, interpreted code, etc.) stored in any computer-readable
medium (e.g., ROM, RAM, magnetic media, punched tape or card, compact disc (CD)
in any form, DVD, etc.). Furthermore, such software may also be in the form of a
computer data signal embodied in a carrier wave, such as that found within the
well-known Web pages transferred among computers connected to the Internet.
Accordingly, the present invention is not limited to any particular platform, unless
specifically stated otherwise in the present disclosure.

While particular embodiments of the present invention have been shown and
described, it will be apparent to those skilled in the art that changes and modifications
may be made without departing from this invention in its broader aspect and,
therefore, the appended claims are to encompass within their scope all such changes
and modifications as fall within the true spirit of this invention.

CLAIMS

We claim: 

1. A method of packet processing comprising:

parsing a packet, said packet having a header portion, to determine a vector;
coordinating processing using said vector;
deconstructing said packet header to form header data;
searching one or more data structures based on said header data to produce
search results;

editing said packet based on said search results, said header data, and said
vector;

wherein said coordinating further comprises monitoring said deconstructing, said
searching, and said editing.

2. The method of Claim 1, wherein said coordinating further comprises 
sharing data with said parsing, said deconstructing, said searching, and said editing. 

3. The method of Claim 1, further comprising buffering said packet
before said parsing.

4. The method of Claim 1, wherein:

said deconstructing further comprises forming a search argument; and 
said searching uses said search argument. 

5. The method of Claim 1, wherein:

said deconstructing further comprises forming a search argument;
said coordinating further comprises operating on said search argument to form
a modified search argument prior to said searching; and
said searching uses said modified search argument.

6. An apparatus for packet processing, comprising:

a central processor for packet processing, said central processor comprising a 
register set; and 

one or more peripheral processors each connected to said central processor and
each comprising a register set, wherein each said peripheral processor
returns at least one datum to said central processor;
wherein said central processor communicates with each said peripheral processor.

7. The apparatus of Claim 6, wherein said central processor comprises a 
general purpose processor. 

8. The apparatus of Claim 6, wherein said central processor comprises a
microsequencer.

9. The apparatus of Claim 6, wherein said central processor comprises 
more than one processor acting in concert. 

10. The apparatus of Claim 6, wherein one or more of said peripheral
processors comprise fixed logic circuits.

11. The apparatus of Claim 6, wherein one or more of said peripheral
processors comprise programmable logic circuits.

12. The apparatus of Claim 6, wherein one or more of said peripheral 
processors comprise a programmable state machine. 

13. The apparatus of Claim 6, wherein a portion of each said peripheral
register set is mapped onto said central processor register set.

14. The apparatus of Claim 6, wherein said central processor and at least 
one peripheral processor together form at least a part of a single application specific 
integrated circuit. 

15. A computer system for packet processing, comprising computer
instructions for: 

parsing a packet, said packet having a header portion, to determine a vector; 
coordinating processing using said vector; 
deconstructing said packet header to form header data; 
searching one or more data structures based on said header data to produce 
search results; 

editing said packet based on said search results, said header data, and said 
vector; 

wherein said coordinating further comprises monitoring said deconstructing, said 
searching, and said editing. 

16. The computer system of Claim 15, wherein said coordinating further 
comprises sharing data with said parsing, said deconstructing, said searching, and said 
editing. 

17. The computer system of Claim 15, further comprising buffering said 
packet before said parsing. 

18. The computer system of Claim 15, wherein: 

said deconstructing further comprises forming a search argument; and 
said searching uses said search argument. 

19. The computer system of Claim 15, wherein: 

said deconstructing further comprises forming a search argument;
said coordinating further comprises operating on said search argument to form
a modified search argument prior to said searching; and
said searching uses said modified search argument.

20. A computer-readable storage medium, comprising computer
instructions for: 

parsing a packet, said packet having a header portion, to determine a vector; 
coordinating processing using said vector; 
deconstructing said packet header to form header data; 
searching one or more data structures based on said header data to produce 
search results; 

editing said packet based on said search results, said header data, and said 
vector; 

wherein said coordinating further comprises monitoring said deconstructing, said 
searching, and said editing. 

21. The computer-readable storage medium of Claim 20, wherein said
coordinating further comprises sharing data with said parsing, said deconstructing, 
said searching, and said editing. 

22. The computer-readable storage medium of Claim 20, further 
comprising buffering said packet before said parsing. 

23. The computer-readable storage medium of Claim 20, wherein:
said deconstructing further comprises forming a search argument; and 
said searching uses said search argument. 

24. The computer-readable storage medium of Claim 20, wherein: 
said deconstructing further comprises forming a search argument; 

said coordinating further comprises operating on said search argument to form
a modified search argument prior to said searching; and
said searching uses said modified search argument.

25. A computer data signal embodied in a carrier wave, comprising
computer instructions for: 

parsing a packet, said packet having a header portion, to determine a vector;
coordinating processing using said vector;
deconstructing said packet header to form header data;
searching one or more data structures based on said header data to produce
search results;

editing said packet based on said search results, said header data, and said
vector;

wherein said coordinating further comprises monitoring said deconstructing, said
searching, and said editing.

26. The computer data signal of Claim 25, wherein said coordinating
further comprises sharing data with said parsing, said deconstructing, said searching,
and said editing.

27. The computer data signal of Claim 25, further comprising buffering
said packet before said parsing.

28. The computer data signal of Claim 25, wherein:
said deconstructing further comprises forming a search argument; and
said searching uses said search argument.

29. The computer data signal of Claim 25, wherein:
said deconstructing further comprises forming a search argument;
said coordinating further comprises operating on said search argument to form
a modified search argument prior to said searching; and
said searching uses said modified search argument.




ABSTRACT OF THE DISCLOSURE 

The present invention consists of a general-purpose, software-controlled
central processor (CP) augmented by a set of task-specific, specialized peripheral
processors (PPs). The central processor accomplishes its functions with the support of
the PPs. Peripheral processors may include but are not limited to a packet parser,
which provides the central processor with a numerical summary of the packet format;
a packet deconstructor, which extracts designated fields from the packet, the positions
of which are determined by the central processor according to the packet format; a
search engine, which is supplied a lookup index by the central processor and returns
its results to it; and a packet editor, which modifies the packet as determined by the
central processor using (in part) information returned from other peripherals. At each
step in the use of this network processor system, the central processor has an
opportunity to intervene and modify the handling of the packet based on its
interpretation of PP results. The programmable nature of the CP and the PPs provides
the system with flexibility and adaptability: rather than having to modify a circuit or
system design in an ASIC or other hardware, new packet processing applications may
be accommodated through the development of new software and its deployment in the
central and/or peripheral processors.
Attorney Docket No.: M-7907 US

1.
Full name of first joint inventor: Petersen, Brian A.
Inventor's Signature: [illegible]    Date: [illegible]
Residence: San Francisco, CA
Post Office Address: 151 Alice B. Toklas Place, #512, San Francisco, CA 94109
Citizenship: United States of America

2.
Full name of joint inventor: Ross, Mark A.
Inventor's Signature: [illegible]    Date: [illegible]
Residence: San Carlos, CA
Post Office Address: 3344 Melendy Lane, San Carlos, CA 94070
Citizenship: United States of America